CLAIMS

What is claimed is: 

1. A method of processing outputs of an automatic system for probabilistic detection of
events, comprising: 

collecting statistics related to observed outputs of the automatic system; and
using the statistics to process an original output sequence of the automatic
system and produce an alternate output sequence, by at least one of supplementing and
replacing at least part of the original output sequence.

2. A method as recited in claim 1, wherein at least part of the alternate output sequence
contains information that can be used by systems that can use the at least part of the original 
output sequence directly. 

3. A method as recited in claim 2, wherein data in the alternate output sequence 
includes confidence assessments regarding parts of at least one of the original and alternate 
output sequences, where the confidence assessments supplement data in the original output 
sequence. 

4. A method as recited in claim 2, wherein data in the alternate output sequence 
includes confidence assessments regarding parts of at least one of the original and alternate 
output sequences, where the confidence assessments replace at least part of the original output 
sequence. 

5. A method as recited in claim 1, wherein the alternate output sequence includes 
information of a plurality of alternatives that can replace at least part of the original output 
sequence that can be used by systems that can use the at least part of the original output 
sequence directly. 

6. A method as recited in claim 5, wherein data in the alternate output sequence 
includes confidence assessments regarding parts of the alternatives, where the confidence 
assessments supplement data in the original output sequence. 

7. A method as recited in claim 5, wherein data in the alternate output sequence
includes confidence assessments regarding parts of the alternatives, where the confidence 
assessments replace at least part of the original output sequence. 

8. A method as recited in claim 1, wherein said collecting comprises at least one of 
noting and estimating correctness of at least one event that the automatic system detected. 

9. A method as recited in claim 1, wherein said collecting comprises at least one of
noting and estimating at least one detectable event that has transpired in correspondence with 
at least part of the original output sequence produced by the automatic system. 

10. A method as recited in claim 1, wherein the detected events involve word 
recognition. 

11. A method as recited in claim 10, wherein the automatic system is an automatic 
speech recognition system. 

12. A method as recited in claim 11, 

wherein the automatic speech recognition system operates on low-grade audio 
signals having word recognition precision below 50 percent; and 

wherein said method further comprises utilizing human transcription of the
low-grade audio signals as a source for data relating to the statistics being collected.

13. A method as recited in claim 10, wherein the automatic probabilistic event detection 
system is an automatic character recognition system. 

14. A method as recited in claim 10, wherein the alternate output sequence includes at 
least one of 

an alternate recognition score for at least one of the words, 

at least one alternate word that may have been one detectable event that
transpired,

the at least one alternate word along with a recognition score for the at least one
alternate word,

at least one alternate sequence of words that may have been another detectable
event that transpired,

the at least one alternate sequence of words along with a recognition score for at 
least one word that is part of the at least one alternate sequence of words, 

an indication that no detectable event has transpired, 

a word lattice describing a plurality of alternatives for detectable word sequences,
and

the word lattice along with a recognition score for at least one among 
at least one word in the detectable word sequences, 
at least one path in the word lattice, and 
at least one edge in the word lattice. 

15. A method as recited in claim 1, wherein said using comprises: 

building a first model modeling behavior of the automatic system as a process 
with at least one inner state, which may be unrelated to inner states of the automatic system, 
and inferring the at least one inner state of the process from the observed outputs of the 
automatic system; 

building a second model, based on the statistics obtained by said collecting, to 
infer data to at least one of supplement and replace at least part of the original output sequence 
from the at least one inner state of the process in the first model; 

combining the first and second models to form a function for converting the 
original output sequence into the alternate output sequence; and 

using the function on the original output sequence of the automatic system to 
create the alternate output sequence. 

16. A method as recited in claim 15, further comprising repeating said using of the 
function on different original output sequences of the automatic system to create additional 
alternate output sequences. 

17. A method as recited in claim 15, wherein the process in said first model is one of a 
Generalized Hidden Markov process and a special case of a Generalized Hidden Markov 
process. 

18. A method as recited in claim 15,

wherein the second model is a parametric model, and 
wherein said building of the second model uses at least one direct parametric 
estimation technique for inferring from at least one of the inner states. 

19. A method as recited in claim 18, wherein the at least one direct parametric 
estimation technique includes at least one of maximal likelihood estimation and entropy 
maximization. 

20. A method as recited in claim 15, wherein for at least one of the inner states said 
building of the second model uses at least one estimation technique utilizing information 
estimated for other inner states. 

21. A method as recited in claim 20, wherein the at least one estimation technique
utilizes at least one of a mixture model and kernel-based learning. 

22. A method as recited in claim 15, wherein said building of the first and second 
models assumes the inner states of the process to be fully determined by the observed outputs 
during at least one point in time. 

23. A method as recited in claim 22, wherein said building of the first and second 
models assumes the inner states of the process during at least one point in time to be fully 
determined by a subset of the observed outputs that includes at least an identity of at least one 
event detected by the automatic system.

24. A method as recited in claim 15, wherein said building of at least one of the first and 
second models uses at least one discretization function. 

25. A method as recited in claim 15, wherein at least one of said building and combining 
uses Bayesian methods. 

26. A method as recited in claim 1, further comprising repeating said collecting on
several statistically different training materials. 

27. A method as recited in claim 26, wherein said collecting uses samples of statistically
different sets of materials as initial training material. 

28. A method as recited in claim 26, further comprising identifying parameters that 
remain invariant between the statistically different sets of materials. 

29. A method as recited in claim 28, wherein said identifying improves estimation of at 
least one of the parameters. 

30. A method as recited in claim 28, wherein said identifying is used to enable training 
when available statistically self-similar sets of materials are too small to allow effective training. 

31. A method as recited in claim 28, wherein said identifying is used to increase 
effectiveness of further training on material that is not statistically similar to initial training 
material. 

32. A method as recited in claim 1, wherein material used for said collecting is 
statistically similar to material used during said using. 

33. At least one computer readable medium storing instructions for controlling at least 
one computer system to perform a method of processing outputs of an automatic system for 
probabilistic detection of events, comprising: 

collecting statistics related to observed outputs of the automatic system; and
using the statistics to process an original output sequence of the automatic
system and produce an alternate output sequence, by at least one of supplementing and
replacing at least part of the original output sequence.

34. At least one computer readable medium as recited in claim 33, wherein at least part 
of the alternate output sequence contains information that can be used by systems that can use 
the at least part of the original output sequence directly. 

35. At least one computer readable medium as recited in claim 34, wherein data in the
alternate output sequence includes confidence assessments regarding parts of at least one of 
the original and alternate output sequences, where the confidence assessments supplement 
data in the original output sequence. 

36. At least one computer readable medium as recited in claim 34, wherein data in the 
alternate output sequence includes confidence assessments regarding parts of at least one of 
the original and alternate output sequences, where the confidence assessments replace at least 
part of the original output sequence. 

37. At least one computer readable medium as recited in claim 33, wherein the alternate 
output sequence includes information of a plurality of alternatives that can replace at least part 
of the original output sequence that can be used by systems that can use the at least part of the 
original output sequence directly. 

38. At least one computer readable medium as recited in claim 37, wherein data in the 
alternate output sequence includes confidence assessments regarding parts of the alternatives, 
where the confidence assessments supplement data in the original output sequence. 

39. At least one computer readable medium as recited in claim 37, wherein data in the 
alternate output sequence includes confidence assessments regarding parts of the alternatives, 
where the confidence assessments replace at least part of the original output sequence. 

40. At least one computer readable medium as recited in claim 33, wherein said 
collecting comprises at least one of noting and estimating correctness of at least one event that 
the automatic system detected. 

41. At least one computer readable medium as recited in claim 33, wherein said 
collecting comprises at least one of noting and estimating at least one detectable event that has 
transpired in correspondence with at least part of the original output sequence produced by the 
automatic system. 

42. At least one computer readable medium as recited in claim 33, wherein the detected
events involve word recognition. 

43. At least one computer readable medium as recited in claim 42, wherein the 
automatic system is an automatic speech recognition system. 

44. At least one computer readable medium as recited in claim 43, 

wherein the automatic speech recognition system operates on low-grade audio 
signals having word recognition precision below 50 percent; and 

wherein said method further comprises utilizing human transcription of the
low-grade audio signals as a source for data relating to the statistics being collected.

45. At least one computer readable medium as recited in claim 42, wherein the 
automatic probabilistic event detection system is an automatic character recognition system. 

46. At least one computer readable medium as recited in claim 42, wherein the alternate 
output sequence includes at least one of 

an alternate recognition score for at least one of the words, 

at least one alternate word that may have been one detectable event that
transpired,

the at least one alternate word along with a recognition score for the at least one
alternate word,

at least one alternate sequence of words that may have been another detectable 
event that transpired, 

the at least one alternate sequence of words along with a recognition score for at 
least one word that is part of the at least one alternate sequence of words, 

an indication that no detectable event has transpired, 

a word lattice describing a plurality of alternatives for detectable word sequences,
and

the word lattice along with a recognition score for at least one among 
at least one word in the detectable word sequences, 
at least one path in the word lattice, and 
at least one edge in the word lattice. 

47. At least one computer readable medium as recited in claim 33, wherein said using
comprises: 

building a first model modeling behavior of the automatic system as a process 
with at least one inner state, which may be unrelated to inner states of the automatic system, 
and inferring the at least one inner state of the process from the observed outputs of the 
automatic system; 

building a second model, based on the statistics obtained by said collecting, to 
infer data to at least one of supplement and replace at least part of the original output sequence 
from the at least one inner state of the process in the first model; 

combining the first and second models to form a function for converting the 
original output sequence into the alternate output sequence; and 

using the function on the original output sequence of the automatic system to 
create the alternate output sequence. 

48. At least one computer readable medium as recited in claim 47, further comprising 
repeating said using of the function on different original output sequences of the automatic 
system to create additional alternate output sequences. 

49. At least one computer readable medium as recited in claim 47, wherein the process 
in said first model is one of a Generalized Hidden Markov process and a special case of a 
Generalized Hidden Markov process. 

50. At least one computer readable medium as recited in claim 47, 

wherein the second model is a parametric model, and 
wherein said building of the second model uses at least one direct parametric estimation 
technique for inferring from at least one of the inner states. 

51. At least one computer readable medium as recited in claim 50, wherein the at least
one direct parametric estimation technique includes at least one of maximal likelihood 
estimation and entropy maximization. 

52. At least one computer readable medium as recited in claim 47, wherein for at least
one of the inner states said building of the second model uses at least one estimation technique 
utilizing information estimated for other inner states. 

53. At least one computer readable medium as recited in claim 52, wherein the at least 
one estimation technique utilizes at least one of a mixture model and kernel-based learning. 

54. At least one computer readable medium as recited in claim 47, wherein said building 
of the first and second models assumes the inner states of the process to be fully determined by 
the observed outputs during at least one point in time. 

55. At least one computer readable medium as recited in claim 54, wherein said building 
of the first and second models assumes the inner states of the process during at least one point 
in time to be fully determined by a subset of the observed outputs that includes at least an 
identity of at least one event detected by the automatic system. 

56. At least one computer readable medium as recited in claim 47, wherein said building 
of at least one of the first and second models uses at least one discretization function. 

57. At least one computer readable medium as recited in claim 47, wherein at least one 
of said building and combining uses Bayesian methods. 

58. At least one computer readable medium as recited in claim 33, further comprising 
repeating said collecting on several statistically different training materials. 

59. At least one computer readable medium as recited in claim 58, wherein said 
collecting uses samples of statistically different sets of materials as initial training material. 

60. At least one computer readable medium as recited in claim 59, further comprising 
identifying parameters that remain invariant between the statistically different sets of materials. 

61. At least one computer readable medium as recited in claim 60, wherein said
identifying improves estimation of at least one of the parameters. 

62. At least one computer readable medium as recited in claim 60, wherein said
identifying is used to enable training when available statistically self-similar sets of materials are 
too small to allow effective training. 

63. At least one computer readable medium as recited in claim 60, wherein said 
identifying is used to increase effectiveness of further training on material that is not statistically 
similar to initial training material. 

64. At least one computer readable medium as recited in claim 33, wherein material 
used for said collecting is statistically similar to material used during said using. 

65. An apparatus for processing outputs of an automatic system for probabilistic 
detection of events, comprising: 

collection means for collecting statistics related to observed outputs of the 
automatic system; and 

processing means for using the statistics to process an original output sequence 
of the automatic system and produce an alternate output sequence, by at least one of 
supplementing and replacing at least part of the original output sequence. 

66. An apparatus as recited in claim 65, wherein the alternate output sequence includes 
information of a plurality of alternatives that can replace at least part of the original output 
sequence that can be used by systems that can use the at least part of the original output 
sequence directly. 

67. An apparatus as recited in claim 66, wherein data in the alternate output sequence 
includes confidence assessments regarding parts of the alternatives, where the confidence 
assessments supplement data in the original output sequence. 

68. An apparatus as recited in claim 66, wherein data in the alternate output sequence 
includes confidence assessments regarding parts of the alternatives, where the confidence 
assessments replace at least part of the original output sequence. 

69. An apparatus as recited in claim 65, wherein the detected events involve word
recognition. 

70. An apparatus as recited in claim 69, wherein the automatic system is an automatic 
speech recognition system. 

71. An apparatus as recited in claim 70, 

wherein the automatic speech recognition system operates on low-grade audio 
signals having word recognition precision below 50 percent; and 

wherein said method further comprises utilizing human transcription of the
low-grade audio signals as a source for data relating to the statistics being collected.

72. An apparatus as recited in claim 69, wherein the alternate output sequence includes 
at least one of 

an alternate recognition score for at least one of the words, 

at least one alternate word that may have been one detectable event that
transpired,

the at least one alternate word along with a recognition score for the at least one
alternate word,

at least one alternate sequence of words that may have been another detectable 
event that transpired, 

the at least one alternate sequence of words along with a recognition score for at 
least one word that is part of the at least one alternate sequence of words, 

an indication that no detectable event has transpired, 

a word lattice describing a plurality of alternatives for detectable word sequences,
and

the word lattice along with a recognition score for at least one among 
at least one word in the detectable word sequences, 
at least one path in the word lattice, and 
at least one edge in the word lattice. 

73. An apparatus as recited in claim 65, wherein said processing means comprises:

first model means for building a first model modeling behavior of the automatic 
system as a process with at least one inner state, which may be unrelated to inner states of the 
automatic system, and inferring the at least one inner state of the process from the observed 
outputs of the automatic system; 

second model means for building a second model, based on the statistics 
obtained by said collection means, to infer data to at least one of supplement and replace at 
least part of the original output sequence from the at least one inner state of the process in the 
first model; 

combination means for combining the first and second models to form a function 
for converting the original output sequence into the alternate output sequence; and 

function means for applying the function to the original output sequence of the 
automatic system to create the alternate output sequence. 

74. An apparatus as recited in claim 73, wherein said function means applies the 
function on different original output sequences of the automatic system to create additional 
alternate output sequences. 

75. An apparatus as recited in claim 73, wherein the process in said first model is one of 
a Generalized Hidden Markov process and a special case of a Generalized Hidden Markov 
process. 

76. An apparatus as recited in claim 73, 

wherein the second model is a parametric model, and 
wherein said second model means uses at least one direct parametric estimation 
technique for inferring from at least one of the inner states. 

77. An apparatus as recited in claim 73, wherein said second model means, for at least 
one of the inner states, uses at least one estimation technique utilizing information estimated for 
other inner states. 

78. An apparatus as recited in claim 77, wherein the at least one estimation technique 
utilizes at least one of a mixture model and kernel-based learning. 

79. A system for processing outputs of an automatic system for probabilistic detection of
events, comprising: 

an interface to receive observed outputs from the automatic system; and 
at least one processor programmed to collect statistics related to the observed 
outputs of the automatic system and to use the statistics to produce an alternate output 
sequence by at least one of supplementing and replacing at least part of an original output 
sequence of the automatic system. 

80. A system as recited in claim 79, wherein at least part of the alternate output 
sequence includes information of a plurality of alternatives that can replace at least part of the 
original output sequence that can be used by systems that can use the at least part of the 
original output sequence directly. 

81. A system as recited in claim 79, wherein data in the alternate output sequence 
includes confidence assessments regarding parts of the alternatives, where the confidence 
assessments supplement data in the original output sequence. 

82. A system as recited in claim 79, wherein data in the alternate output sequence 
includes confidence assessments regarding parts of the alternatives, where the confidence 
assessments replace at least part of the original output sequence.

83. A system as recited in claim 79, wherein the detected events involve word 
recognition. 

84. A system as recited in claim 83, wherein the automatic system is an automatic 
speech recognition system. 

85. A system as recited in claim 84, 

wherein the automatic speech recognition system operates on low-grade audio 
signals having word recognition precision below 50 percent; and 

wherein said interface further receives human transcription of the low-grade 
audio signals for use by said processor as data relating to the statistics being collected. 

86. A system as recited in claim 83, wherein the alternate output sequence includes at
least one of 

an alternate recognition score for at least one of the words, 

at least one alternate word that may have been one detectable event that
transpired,

the at least one alternate word along with a recognition score for the at least one
alternate word,

at least one alternate sequence of words that may have been another detectable 
event that transpired, 

the at least one alternate sequence of words along with a recognition score for at 
least one word that is part of the at least one alternate sequence of words, 

an indication that no detectable event has transpired, 

a word lattice describing a plurality of alternatives for detectable word sequences,
and

the word lattice along with a recognition score for at least one among 
at least one word in the detectable word sequences, 
at least one path in the word lattice, and 
at least one edge in the word lattice. 

87. A system as recited in claim 79, wherein said processor is programmed to build a 
first model modeling behavior of the automatic system as a process with at least one inner 
state, which may be unrelated to inner states of the automatic system, and inferring the at least 
one inner state of the process from the observed outputs of the automatic system, to build a 
second model, based on the statistics obtained, to infer data to at least one of supplement and 
replace at least part of the original output sequence from the at least one inner state of the 
process in the first model, to combine the first and second models to form a function for 
converting the original output sequence into the alternate output sequence, and to apply the 
function to the original output sequence of the automatic system to create the alternate output 
sequence. 

88. A system as recited in claim 87, wherein said processor applies the function on
different original output sequences of the automatic system to create additional alternate output 
sequences. 

89. A system as recited in claim 87, wherein the process in said first model is one of a 
Generalized Hidden Markov process and a special case of a Generalized Hidden Markov 
process. 

90. A system as recited in claim 87, 

wherein the second model is a parametric model, and 
wherein said processor builds the second model using at least one direct parametric 
estimation technique for inferring from at least one of the inner states. 

91. A system as recited in claim 87, wherein said processor, for at least one of the inner
states, uses at least one estimation technique utilizing information estimated for other inner 
states. 

92. A system as recited in claim 91, wherein said processor is programmed to utilize at
least one of a mixture model and kernel-based learning as the at least one estimation 
technique. 
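
By way of illustration only, the collecting and using steps recited in claims 1 and 15 might be sketched as follows. The sketch assumes the simplification of claims 22 to 24 (the inner state is taken to be fully determined by the recognized word and a discretized recognition score) and uses a simple relative-frequency (maximum likelihood) estimate for the second model. All function names, the bin edges, and the data layout are hypothetical and are not taken from the claims.

```python
from collections import Counter, defaultdict

def discretize(score, edges=(0.33, 0.66)):
    """Discretization function: map a score in [0, 1] to a bin index (edges are assumed)."""
    return sum(score >= edge for edge in edges)

def infer_state(word, score):
    """First model (simplified): the inner state is inferred directly from the observed output."""
    return (word, discretize(score))

def collect_statistics(training_pairs):
    """Collect counts of the true word (e.g. from human transcription) per inner state.

    training_pairs: iterable of ((recognized_word, score), true_word).
    """
    stats = defaultdict(Counter)
    for (word, score), truth in training_pairs:
        stats[infer_state(word, score)][truth] += 1
    return stats

def build_converter(stats):
    """Combine both models into a function from original to alternate output sequence."""
    def convert(original_sequence):
        alternate = []
        for word, score in original_sequence:
            counts = stats.get(infer_state(word, score))
            if not counts:
                # State never seen during collecting: keep the original output unchanged.
                alternate.append([(word, score)])
                continue
            total = sum(counts.values())
            # Second model: relative-frequency estimate of each alternative's confidence.
            alternate.append([(w, c / total) for w, c in counts.most_common()])
        return alternate
    return convert
```

Each element of the returned sequence is a list of alternative words with confidence assessments, which either replaces the original word and score or, for states not seen during collecting, passes them through unchanged.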